Building Word-Emotion Mapping Dictionary for Online News
Abstract
Sentiment analysis of online documents such as news articles, blogs and microblogs has received increasing attention. We propose an efficient method for automatically building a word-emotion mapping dictionary for social emotion detection. In the dictionary, each word is associated with a distribution over a set of human emotions. In addition, three different pruning strategies are proposed to refine the dictionary. Experiments on real-world data sets validate the effectiveness and reliability of the method. Compared with other lexicons, the dictionary generated using our approach is more adaptable to personalized data sets, language-independent, fine-grained, and unlimited in volume. The generated dictionary has a wide range of applications, including predicting the emotional distribution of news articles and tracking how social emotions about particular events change over time.
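To make the data structure concrete, the following is a minimal sketch of a dictionary that maps each word to a distribution over emotions, and of using it to estimate the emotional distribution of an article. The abstract does not specify the construction or pruning procedures, so everything here is an assumption for illustration only: the names `build_dictionary` and `predict`, the input format (tokenized articles paired with reader emotion vote counts), the frequency-weighted accumulation, and the single `min_doc_freq` pruning rule are hypothetical and are not the authors' method.

```python
from collections import Counter, defaultdict


def build_dictionary(docs, min_doc_freq=1):
    """Build a word -> {emotion: probability} mapping (illustrative only).

    docs: list of (tokens, votes) pairs, where votes maps an emotion
    label to the number of reader votes that article received.
    min_doc_freq: hypothetical pruning rule that drops words appearing
    in fewer than this many articles.
    """
    totals = defaultdict(Counter)  # word -> accumulated emotion mass
    doc_freq = Counter()           # word -> number of articles containing it
    for tokens, votes in docs:
        vote_sum = sum(votes.values())
        if vote_sum == 0:
            continue  # an article without votes carries no emotion signal
        for word, n in Counter(tokens).items():
            doc_freq[word] += 1
            for emotion, v in votes.items():
                # Weight the article's emotion share by the word's frequency.
                totals[word][emotion] += n * v / vote_sum
    dictionary = {}
    for word, mass in totals.items():
        if doc_freq[word] < min_doc_freq:
            continue  # pruned as too rare
        z = sum(mass.values())
        dictionary[word] = {e: m / z for e, m in mass.items()}
    return dictionary


def predict(dictionary, tokens):
    """Estimate an article's emotion distribution by summing word distributions."""
    scores = Counter()
    for token in tokens:
        for emotion, p in dictionary.get(token, {}).items():
            scores[emotion] += p
    z = sum(scores.values())
    return {e: s / z for e, s in scores.items()} if z else {}


# Toy data, invented for this example.
docs = [
    (["flood", "rescue"], {"sad": 8, "touched": 2}),
    (["rescue", "hero"], {"touched": 9, "sad": 1}),
]
dictionary = build_dictionary(docs)
estimate = predict(dictionary, ["flood", "hero"])
```

In this sketch a word that occurs in several articles ends up with a blend of their emotion shares, and raising `min_doc_freq` removes words seen in too few articles; the three pruning strategies mentioned in the abstract may work quite differently.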
Similar resources
Building Monolingual Word Alignment Corpus for the Greater China Region
For a single semantic meaning, various linguistic expressions exist across the Mainland China, Hong Kong and Taiwan varieties of Mandarin Chinese, a.k.a. the Greater China Region (GCR). Unlike existing bilingual word alignment corpora, in this paper we have constructed two monolingual GCR corpora. One is an 11,623-triple GCR word dictionary corpus which is automatically extracted and manuall...
Development and Evaluation of Tagalog Linguistic Inquiry and Word Count (LIWC) Dictionaries for Negative and Positive Emotion
Introduction: As the use of online and social media increases globally, the need for sentiment analysis tools in multiple languages is critical in order to understand and analyze the vast amount of data that may contain users’ feelings, perceptions, and beliefs. Users from different countries convey their messages in various languages, which may convey different sentiments and cultural connotati...
A Multi-Lingual Dictionary of Dirty Words
We present a multi-lingual dictionary of dirty words. We have collected about 3,200 dirty words in several languages and built a database of these. The language with the most words in the database is English, though there are also several hundred dirty words in, for instance, Japanese. Words are classified into their general meaning, such as what part of the human anatomy they refer to. Words can ...
Speaker-independent Natural Arabic Speech Recognition System
This paper reports the results of the first phase of research on building a high-performance, speaker-independent natural Arabic speech recognition system. This work aims at developing an Arabic broadcast news transcription system and a base system for further research. Several concurrent recent advances in Arabic language processing were crucial for the success of this stage, e.g. autom...
Lost in Translations? Building Sentiment Lexicons using Context Based Machine Translation
In this paper, we propose a simple yet effective approach to automatically building sentiment lexicons from English sentiment lexicons using publicly available online machine translation services. The method does not rely on any semantic resources or bilingual dictionaries, and can be applied to many languages. We propose to overcome the low coverage problem by putting each English sentimen...
Publication date: 2012